Classification Using Association Rules: Weaknesses and Enhancements

Authors

  • Bing Liu
  • Yiming Ma
  • Ching-Kian Wong
Abstract

Existing classification and rule learning algorithms in machine learning mainly use heuristic/greedy search to find a subset of regularities (e.g., a decision tree or a set of rules) in data for classification. In the past few years, extensive research has been done in the database community on learning rules using exhaustive search under the name of association rule mining. The objective there is to find all rules in data that satisfy the user-specified minimum support and minimum confidence. Although the whole set of rules may not be used directly for accurate classification, effective and efficient classifiers have been built using the rules. This paper aims to improve one such exhaustive-search-based classification system, CBA (Classification Based on Associations). The main strength of this system is that it is able to use the most accurate rules for classification. However, it also has weaknesses. This paper proposes two new techniques to deal with these weaknesses, which results in remarkably accurate classifiers. Experiments on a set of 34 benchmark datasets show that on average the new techniques reduce the error of CBA by 17% and are superior to CBA on 26 of the 34 datasets. They reduce the error of the decision tree classifier C4.5 by 19%, and improve performance on 29 datasets. Similarly good results are also achieved against the existing classification systems RIPPER, LB and a Naïve-Bayes classifier.
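
To make the mining objective in the abstract concrete, the following is a minimal Python sketch of exhaustively enumerating rules of the form "itemset -> class" and keeping those that meet a minimum support and a minimum confidence. It is an illustration only, not the CBA system or the techniques proposed in the paper: the function name `mine_class_rules`, its parameters, and the `max_len` cap are assumptions made for this example, and it uses brute-force enumeration rather than an efficient candidate-generation scheme.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Enumerate rules (itemset -> class) that meet both thresholds.

    rows: list of sets of items, e.g. {"outlook=sunny", "windy=no"}.
    labels: list of class labels, one per row.
    max_len: hypothetical cap on itemset size to keep enumeration small.
    """
    n = len(rows)
    cond_counts = Counter()  # rows containing the itemset
    rule_counts = Counter()  # rows containing the itemset AND having the class
    for items, label in zip(rows, labels):
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                cond_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), count in rule_counts.items():
        support = count / n                      # fraction of all rows matching the rule
        confidence = count / cond_counts[combo]  # fraction of itemset rows with this class
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))

    # Most confident rules first, then higher support, then shorter itemsets.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0], r[1]))
    return rules
```

Calling `mine_class_rules(rows, labels, min_support=0.5, min_confidence=0.7)` on a small list of item sets returns every qualifying rule as a tuple of itemset, class, support and confidence; building a classifier from such rules is a separate step that this sketch does not attempt.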

Similar resources

Improving an Association Rule Based Classifier

Existing classification algorithms in machine learning mainly use heuristic search to find a subset of regularities in data for classification. In the past few years, extensive research was done in the database community on learning rules using exhaustive search under the name of association rule mining. Although the whole set of rules may not be used directly for accurate classification, effec...

Numeric Multi-Objective Rule Mining Using Simulated Annealing Algorithm

Abstract as a single objective one. Measures like support, confidence and other interestingness criteria which are used for evaluating a rule, can be thought of as different objectives of association rule mining problem. Support count is the number of records, which satisfies all the conditions that exist in the rule. This objective represents the accuracy of the rules extracted from the da...

Optimization of Spatial Association Rule Mining using Hybrid Evolutionary algorithm

Spatial data refer to any data about objects that occupy real physical space. Attributes within spatial databases usually include spatial information. Spatial data refers to the numerical or categorical values of a function at different spatial locations. Spatial metadata refers to the descriptions of the spatial configuration. Application of classical association rule mining concepts to spatia...

Using a Data Mining Tool and FP-Growth Algorithm Application for Extraction of the Rules in two Different Dataset (TECHNICAL NOTE)

In this paper, we want to improve association rules in order to be used in recommenders. Recommender systems present a method to create the personalized offers. One of the most important types of recommender systems is the collaborative filtering that deals with data mining in user information and offering them the appropriate item. Among the data mining methods, finding frequent item sets and ...

Reducing Network Intrusion Detection using Association rule and Classification algorithms

IDS (Intrusion Detection system) is an active and driving defense technology. This project mainly focuses on intrusion detection based on data mining. Data mining is to identify valid, novel, potentially useful, and ultimately understandable patterns in massive data. This project presents an approach to detect intrusion based on data mining frame work. Intrusion Detection System (IDS) is a popu...

Publication date: 2001